A non-parametric method for building predictive genetic tests on high-dimensional data.

Authors

  • Chengyin Ye
  • Yuehua Cui
  • Changshuai Wei
  • Robert C Elston
  • Jun Zhu
  • Qing Lu
Abstract

OBJECTIVE: Predictive tests that capitalize on emerging genetic findings hold great promise for enhanced personalized healthcare. With the emergence of a large amount of data from genome-wide association studies (GWAS), interest has shifted towards high-dimensional risk prediction.

METHODS: To form predictive genetic tests on high-dimensional data, we propose a non-parametric method, called the 'forward ROC method'. The method adopts a computationally efficient algorithm to search for environmental risk factors, genetic predictors across the entire genome, and their possible interactions for an optimal risk prediction model, without relying on prior knowledge of known risk factors. An efficient yet powerful procedure is also incorporated into the method to handle missing data.

RESULTS: Through simulations and real data applications, we found that our proposed method outperformed the existing approaches. We applied the new method to the Wellcome Trust rheumatoid arthritis GWAS dataset with a total of 460,547 markers. The results from the risk prediction analysis suggested important roles of HLA-DRB1 and PTPN22 in predicting rheumatoid arthritis.

CONCLUSION: We proposed a powerful and robust approach for high-dimensional risk prediction. The new method will facilitate future risk prediction that considers a large number of predictors and their interactions for improved performance.
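The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a forward search guided by the area under the ROC curve (AUC), not the authors' forward ROC method. It assumes categorical predictors (for example genotypes coded 0/1/2), scores each sample by the case fraction among samples sharing the same combination of selected predictors, and greedily adds the predictor that raises the in-sample AUC the most. The function names `auc`, `cell_scores`, and `forward_auc_selection`, the stopping rule, and the toy data are all hypothetical, and missing-data handling is not covered.

```python
from collections import defaultdict


def auc(scores, labels):
    """AUC via the Mann-Whitney statistic; tied case/control pairs count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cell_scores(rows, labels, selected):
    """Score each sample by the case fraction of its predictor-combination cell."""
    keys = [tuple(row[j] for j in selected) for row in rows]
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for key, y in zip(keys, labels):
        counts[key][y] += 1
    return [counts[k][1] / (counts[k][0] + counts[k][1]) for k in keys]


def forward_auc_selection(rows, labels, max_predictors=3, min_gain=1e-9):
    """Greedy forward search: add the predictor giving the largest AUC gain.

    Assumption: stop when no candidate improves the AUC by at least min_gain
    or when max_predictors have been chosen.
    """
    selected, best = [], 0.5  # 0.5 is the AUC of a model with no predictors
    n_predictors = len(rows[0])
    while len(selected) < max_predictors:
        candidates = [j for j in range(n_predictors) if j not in selected]
        if not candidates:
            break
        scored = [
            (auc(cell_scores(rows, labels, selected + [j]), labels), j)
            for j in candidates
        ]
        # Highest AUC wins; ties go to the lowest predictor index.
        top_auc, top_j = max(scored, key=lambda t: (t[0], -t[1]))
        if top_auc - best < min_gain:
            break
        selected.append(top_j)
        best = top_auc
    return selected, best


# Toy data (made up): six samples, three categorical predictors, 1 = case.
toy_rows = [(0, 0, 1), (1, 0, 0), (2, 1, 1), (0, 2, 0), (1, 2, 1), (2, 1, 0)]
toy_labels = [0, 0, 0, 1, 1, 1]
chosen, in_sample_auc = forward_auc_selection(toy_rows, toy_labels)
print(chosen, in_sample_auc)
```

Because the cells shrink as predictors are added, the in-sample AUC in a sketch like this is optimistic; any practical use would need to evaluate the AUC on held-out data, for example by cross-validation.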

Related resources

Predictive Ability of Statistical Genomic Prediction Methods When Underlying Genetic Architecture of Trait Is Purely Additive

A simulation study was conducted to address the issue of how purely additive (simple) genetic architecture might impact on the efficacy of parametric and non-parametric genomic prediction methods. For this purpose, we simulated a trait with narrow sense heritability h² = 0.3, with only additive genetic effects for 300 loci in order to compare the predictive ability of 14 more practically used ge...

Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data

Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...

Optimal Pareto Parametric Analysis of Two Dimensional Steady-State Heat Conduction Problems by MLPG Method

Numerical solutions obtained by the Meshless Local Petrov-Galerkin (MLPG) method are presented for two dimensional steady-state heat conduction problems. The MLPG method is a truly meshless approach, and neither the nodal connectivity nor the background mesh is required for solving the initial-boundary-value problem. The penalty method is adopted to efficiently enforce the essential boundary co...

A Parametric Study on the Progressive Collapse Potential of Steel Buildings under Truck Collision

In this paper, the initiation and propagation of structural damage in a building due to a truck collision with one of its corner columns were investigated. For this purpose, a three-dimensional 4-story moment resisting steel frame with intermediate ductility was considered. The structure was designed using ETABS software under standard dead, live, and earthquake loads, and then impact loading w...

Simulation of Smoke Emission from Fires in High-Rise Buildings Using the 3D Model Generated from 2-Dimensional Cadastral Data

A 3-dimensional model of high-rise buildings can be used in disaster management, for example in fires, to reduce casualties. The fundamental dilemma in 3D building modeling is the unavailability of suitable data sources. However, available cadastral 2D maps could be used as low-cost and attainable resources for 3D building modeling. Smoke will be a great threat to people's health during a f...

Journal title:
  • Human heredity

Volume 71, Issue 3

Pages: -

Publication date: 2011